Skip to content

fix(sync): repair persisted Codex goal-context rows#874

Merged
wesm merged 5 commits into
kenn-io:mainfrom
tpn:868-codex-goal-awareness
Jun 28, 2026
Merged

fix(sync): repair persisted Codex goal-context rows#874
wesm merged 5 commits into
kenn-io:mainfrom
tpn:868-codex-goal-awareness

Conversation

@tpn

@tpn tpn commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Refinement on top of #886 / 9bee8a3, which already taught the Codex parser to classify /goal continuation envelopes as system content for newly parsed sessions.

This PR handles the archive and compatibility side of that fix. It bumps the parser data version to 56 on top of main's version-55 Kimi usage-event bump, so existing source-backed Codex sessions are non-destructively re-parsed and old stored /goal rows are removed from message lists and user-turn counts instead of remaining in persistent SQLite archives.

It also adds read-path fallbacks for legacy rows that may still exist before reparse, or in rows that cannot be re-emitted from source files. Backend system-prefix filtering now uses a dedicated Codex goal-context predicate plus SQLite/PostgreSQL/DuckDB SQL variants, while frontend system-message detection and HTML/focused export use the same wrapper semantics. Both the current <codex_internal_context ... source="goal"> form, including extra attributes, and the older <goal_context> wrapper stay out of search, transcript filtering, and exported sessions.

The main tradeoff is the data-version bump: multi-host PostgreSQL deployments should upgrade pg serve and all pushers together, because older agentsview binaries reject rows written by a newer parser data version.

@tpn tpn marked this pull request as ready for review June 26, 2026 01:57
@roborev-ci

roborev-ci Bot commented Jun 26, 2026

Copy link
Copy Markdown

roborev: Combined Review (69c4679)

HTML export has one medium-severity visibility regression; security review found no issues.

Medium

  • Location: internal/server/export.go:678
    Problem: The new HTML export filter drops every db.IsSystemPrefixed user message, which also removes promoted Claude boundary rows such as continuations, interruptions, task notifications, and stop-hook feedback that the SPA intentionally keeps visible via source_subtype.
    Fix: Restrict the export filter to the Codex goal-context prefixes, or mirror the frontend visibility rules by preserving visible system subtypes before applying prefix fallback.

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 3m55s), codex_security (codex/security, done, 58s) | Total: 4m59s

@wesm

wesm commented Jun 26, 2026

Copy link
Copy Markdown
Member

This was probably fixed by #886, let me rebase this and see

@tpn tpn force-pushed the 868-codex-goal-awareness branch from b7335d0 to f1db0c0 Compare June 26, 2026 18:32
@tpn tpn changed the title Fix Codex goal continuation transcript handling Add legacy Codex goal-context fallbacks Jun 26, 2026
@wesm wesm force-pushed the 868-codex-goal-awareness branch from f1db0c0 to 68ff8ab Compare June 26, 2026 18:55
@wesm wesm changed the title Add legacy Codex goal-context fallbacks fix(sync): repair persisted Codex goal-context rows Jun 26, 2026
@roborev-ci

roborev-ci Bot commented Jun 26, 2026

Copy link
Copy Markdown

roborev: Combined Review (68ff8ab)

Summary verdict: One medium issue remains; no high or critical findings were reported.

Medium

  • internal/server/export.go:678: filterExportHTMLMessages now drops every db.IsSystemPrefixed user message, not just Codex goal context wrappers. That can also remove Claude promoted system-boundary messages such as continuation, interruption, task notifications, and stop-hook feedback, which the frontend keeps visible via source_subtype. HTML exports may silently omit visible transcript events.
    • Fix: Narrow the export filter to only goal-context wrappers, or make it respect visible system subtypes the same way the frontend does.

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 3m23s), codex_security (codex/security, done, 10s) | Total: 3m39s

@tpn tpn force-pushed the 868-codex-goal-awareness branch from 68ff8ab to f1b131a Compare June 26, 2026 21:18
@tpn

tpn commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the RoboRev export finding in f1b131ae: filterExportHTMLMessages and Focused export now use a Codex goal-context-specific predicate instead of db.IsSystemPrefixed, so non-goal system-prefixed rows such as continuation/task/stop-hook messages remain in HTML exports. Added DB and export regression coverage for the narrowed behavior.

@roborev-ci

roborev-ci Bot commented Jun 26, 2026

Copy link
Copy Markdown

roborev: Combined Review (f1b131a)

Medium finding only: goal-context filtering is inconsistent across code paths.

Medium

  • internal/db/search.go:29 - IsGoalContextPrefixed recognizes <codex_internal_context ... source="goal"> with extra attributes, but SystemMsgPrefixes and the frontend only match the exact <codex_internal_context source="goal"> prefix. A stored/read-only row like <codex_internal_context foo="bar" source="goal">... can be omitted from HTML export while still appearing in search/UI/MCP paths that rely on IsSystemPrefixed or SystemPrefixSQL.

    Suggested fix: Reuse the same goal-context matching semantics in Go callers where possible, and add equivalent SQL/TypeScript matching for an opening tag with a real source="goal" attribute regardless of attribute order or extra attributes.


Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 3m48s), codex_security (codex/security, done, 38s) | Total: 4m35s

tpn and others added 4 commits June 26, 2026 16:55
Codex goal context records should be treated as archived system context instead of ordinary transcript content, otherwise legacy metadata can leak into transcript rendering and export/search surfaces. Keeping the parser classification explicit gives downstream filtering a stable boundary for hiding those rows without losing the archived messages.

- fix(transcript): hide legacy Codex goal context rows
- merge: sync with origin/main
The parser-level /goal classification already landed in 9bee8a3, so this branch should only carry the stale persisted-row repair path. Removing the repeated parser refactor and duplicate tests keeps the remaining diff focused on data-version resync and runtime filtering for legacy stored messages.
@tpn tpn force-pushed the 868-codex-goal-awareness branch from f1b131a to 5826ee7 Compare June 26, 2026 23:59
@tpn

tpn commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the latest RoboRev goal-context consistency finding in 5826ee7e: IsSystemPrefixed, SQLite/PostgreSQL/DuckDB SystemPrefixSQL paths, the frontend system-message helper, and HTML/focused export now share the same Codex goal-wrapper semantics for <codex_internal_context ... source="goal"> with extra attributes plus legacy <goal_context> rows. Rebased onto origin/main at e3a0af35; the parser data-version bump is now 56 because main already uses 55 for the Kimi usage-event reparse.

@roborev-ci

roborev-ci Bot commented Jun 27, 2026

Copy link
Copy Markdown

roborev: Combined Review (5826ee7)

Summary verdict: One medium correctness issue remains; no high or critical findings were reported.

Medium

  • Location: internal/db/search.go:52
  • Problem: The new matcher treats data-source="goal" as non-goal, but the Codex parser still drops any <codex_internal_context...> opening tag containing source="goal". With the data version bump forcing reparse, a non-goal internal context can be preserved in old rows/search/export but deleted from persisted messages after resync.
  • Fix: Update the Codex parser to use the same attribute-boundary matching as IsGoalContextPrefixed, and add parser tests for data-source="goal" and other non-goal internal contexts.

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 5m17s), codex_security (codex/security, done, 24s) | Total: 5m47s

Codex goal-context parsing still matched source="goal" as a raw substring, so wrappers such as data-source="goal" were treated as synthetic goal continuations. A data-version reparse could therefore drop messages that the search and export filters correctly preserve.

Match the source attribute with the same boundary semantics as the persisted-row matcher so only the real source="goal" attribute is suppressed.
@roborev-ci

roborev-ci Bot commented Jun 28, 2026

Copy link
Copy Markdown

roborev: Combined Review (4b8d97b)

No issues found.


Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 6m44s), codex_security (codex/security, done, 41s) | Total: 7m25s

@wesm wesm merged commit a4584de into kenn-io:main Jun 28, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants